Combining Machine Learning and Active Objects for Parallel Data Mining
Abstract
The necessity and usefulness of Data Mining (DM) and Knowledge Discovery in Databases (KDD) are now largely established in both the scientific and industrial communities, and a number of real applications have already been developed in domains ranging from space data to financial analysis [1]. However, the ever-growing size of real-world databases (DB) makes scaling up DM algorithms a natural requirement. In this work, we are interested in the task of classification based on rule induction. The paper presents the system IFORD (Induction of First Order Rules from Databases), designed for learning first-order rules from relational DBs. The serial version of IFORD is explained in Section 2. The next section explains how active objects are used to allow parallel execution of the mining process. We conclude the paper by presenting our first experiments and future directions.
Similar resources
Evaluating machine learning methods and satellite images to estimate combined climatic indices
The reflections recorded on satellite images have been affected by various environmental factors. In these images, some of these factors are combined with other environmental factors that cannot be distinguished. Therefore, it seems wise to model these environmental phenomena in the form of hybrid indicators. In this regard, satellite imagery and machine learning methods can play a unique role ...
PaDDMAS: Parallel and Distributed Data Mining Application Suite
Discovering complex associations, anomalies and patterns in distributed data sets is gaining popularity in a range of scientific, medical and business applications. Various algorithms are employed to perform data analysis within a domain, and range from statistical to machine learning and AI based techniques. Several issues need to be addressed however to scale such approaches to large data set...
A Comparative Study of SVM and RF Methods for Classification of Alteration Zones Using Remotely Sensed Data
Identification and mapping of the significant alterations are the main objectives of the exploration geochemical surveys. The field study is time-consuming and costly to produce the classified maps. Therefore, the processing of remotely sensed data, which provide timely and multi-band (multi-layer) data, can be substituted for the field study. In this study, the ASTER imagery is used for altera...
Financial Reporting Fraud Detection: An Analysis of Data Mining Algorithms
In the last decade, high profile financial frauds committed by large companies in both developed and developing countries were discovered and reported. This study compares the performance of five popular statistical and machine learning models in detecting financial statement fraud. The research objects are companies which experienced both fraudulent and non-fraudulent financial statements betw...
Detecting Diseases in Medical Prescriptions Using Data Mining Tools and Combining Techniques
Data about the prevalence of communicable and non-communicable diseases, as one of the most important categories of epidemiological data, is used for interpreting health status of communities. This study aims to calculate the prevalence of outpatient diseases through the characterization of outpatient prescriptions. The data used in this study is collected from 1412 prescriptions for various ty...
Publication date: 2001